ABSTRACT

Validation in curriculum development is the
"check-and-balance" dimension of any instructional system, in the
broadest sense almost synonymous with evaluation and accountability.
This paper relates validation to individual, formative, and summative
evaluation. Validation measures to be applied to instructional
systems are outlined according to a 12-point model reported by F.
Coit Butler. Curriculum development is concerned with
criterion-referenced tests (CRT) and the CRT is central to all
validation efforts. The paper discusses validity of the curriculum
generally and of the CRT specifically with reference to reliability
and other factors. The appendix consists of instructional systems
development charts from various sources. (MF)

INTRODUCTION

This paper looks at validation as one of the many aspects of the
comprehensive and complex process of curriculum development in
vocational and technical education. Validation is referred to here as
an aspect of the curriculum development process rather than a step in
the process. The reason is that validation measures are taken in more
than one of the steps in the process, and for more than one reason.
Hence, validation as a concept is rather pervasive. Lee J. Cronbach
suggests that validation is more than the process of examining the
accuracy of a specific prediction or inference from a test score;
validation means to demonstrate the worth of; to validate is to
investigate.

Validation may be viewed as the "mortar" that holds together the
"bricks" of curriculum development. It is the "check-and-balance" dimension
of any instructional system. Validation is important both to job/task
analysis and to deriving behavioral-performance objectives. Certain
types of validation measures are taken during design and tryout of
materials, during the conduct of training after implementation, at the
end of a training action, and even on the job after the trained individual
enters work.

In the broadest sense, validation is almost synonymous with
"evaluation" and "accountability." Validation as used in this paper is
directly related to individual evaluation, formative evaluation, and
summative evaluation. Distinctions among these three are as follows:

Individual evaluation is the monitoring of the performance of 
each learner as a basis for making decisions about his further vertical 
progression in a particular sequence, or his transfer laterally to other 
sequences. This is usually done by frequent monitoring of learner 
performance. 

Formative evaluation is conducted during the experimental period
and provides feedback for improvement of the instructional package.
Formative evaluation is the collection of appropriate evidence during
the construction of a new curriculum in such a way that revisions of
the curriculum can be based on evidence. It is ongoing and is carried
out concurrently with the instruction. It is distinct from individual
evaluation in that the focus is on the instructional system itself
rather than the learner. It seems then that "validation" is "formative 
evaluation." Formative tests have two purposes: (1) to find out how 
much students have learned in a restricted area of content, and (2) to 
assess whether instruction has been properly designed and conducted. 
Design as used here refers to appropriate content, sequence, and method 
of instruction. 

Summative evaluation is the overall assessment of a final
instructional package. Summative evaluation is the collection of data
at the end of a training program to determine its effectiveness. It
does not occur during the design, but rather subsequent to development,
refinement, and implementation.

The primary purpose behind conducting validation procedures is to
determine if the planned curriculum will achieve the responses for which
it was designed. Validation searches for evidence to indicate that the
curriculum can cause individuals to achieve its predetermined behavioral
objectives. Validation measures should, for the most part, show that
when objectives are not achieved, the fault lies with the instructional
system, not with the student.

Without validation measures being applied, curriculum design is
greatly handicapped. The curriculum designer usually uses only
subjective and/or personal judgment in job/task analysis, formulation of
behavioral objectives, and even in the construction of a criterion test.
Indeed, in the early stages of instructional design, personal opinions
and judgments of experienced persons as to what ought to be included in
a course are valuable and necessary. However, observation,
intuition, interviews, expertise, and historical precedent serve to
support the designer just so long. They can serve to set early patterns
for the designer. The instructional systems concept, on the other hand,
demands empirical evidence that is derived through objective evaluation
of content before further course design takes place. The criterion test
is constructed to objectively validate course content, but its items must
be validated before it can be so used.

Historically speaking, the judgment of the instructor many years ago
was regarded as a criterion. Many instructors continue to rely on
personal judgment as the criterion for constructing test items, and claim
that when completed, they have an "obviously valid test." The "obviously
valid test" cannot be validated by correlating it with something else.
However, it can be improved by applying the factor analysis technique to
criterion variables.

The writer of this paper does not wish to convey the idea that all
curriculum designers should become "empiricists." Neither should we go
in the other direction and become total "impressionists." The empiricist
validates in a completely objective, formal manner. He insists on having
numerical scores, no matter how crude his instrument, to be interpreted
within minimal errors of measurement, and on predictions having indexes
showing how likely they are to come true. The impressionist, on the other
hand, bases his analyses on observations and informal measures and 
estimates. He intuitively compares impressions based on one procedure 
with impressions gained from another. He can be classified as being 
somewhat more casual in his validation efforts. 

Validation in vocational-technical curriculum design requires the
designer to be somewhere on the continuum between the empiricist on the 
one hand and the impressionist on the other. He should be intermediate 
between obsession with scores and unrestrained use of intuition. Formal 
objective procedures should be combined with informal judgment in all 
validation efforts. 

ASSUMPTIONS 

For purposes of this paper, several assumptions should be made
before proceeding with further discussion about validation. These are 
stated here to get us on common ground so that we might begin thinking 
along similar lines. 

First, the assumption should be made that the curriculum designer
has decided to utilize some type of systems approach or model in the
development of his curriculum and its components. Leonard C. Silvern
defines a system as: "the structure or organization of an orderly whole,
clearly showing the interrelationship of the parts to each other and to
the whole itself." Curriculum development involves a step-by-step process,
and a system best accommodates a process, or cycled steps. The use of a
systems approach implies comprehensiveness of steps, as well as
interdependence of stages, components, and concepts. Systems analysis
techniques enable the designer to better select the stage (time sequence)
of the program operation he must validate, i.e., to identify the relevant
curriculum components with the outcome changes being measured.

The use of a systems approach assures that all the necessary
assessments will be made. For example, some measure should be made of
student performance or entering behavior before he begins a new
curriculum (pre-assessment). Similarly, two types of assessment should be
continually made while the student is undergoing training
(during-instruction). One of these types of assessment serves as feedback
for reinforcement; the other assures acquisition of behaviors that are
prerequisite to lateral movement to other experiences. Likewise, two
types of assessments should be made after completion of a lesson or a
unit (post-assessment). One aids in determining if a student is prepared
for vertical progression to related or advanced experiences; the other
serves as a type of summative evaluation, as well as a predictor of
success on the job or in more advanced courses.

Constant improvement of curriculum is a worthy goal. Hence, the
influence and use of pre-assessment is an important variable for
validation, since it is not the terminal criterion behavior alone which
dictates required instructional manipulation, but the difference between
entering and terminal behavior.



Apparently there are as many systems models as there are curriculum 
designers. However, almost all of them have some common features. These
commonalities usually resemble the following: 

1. Job specification or analysis.

2. Specification of objectives. 

3. Development of preliminary system design. 

4. Development — test — revise cycle applied to the system. 

5. Implementation and field testing the system. 

6. Follow-up and/or summative evaluation.

Included in the appendices are some excellent examples of models 
that persons have prepared to graphically depict curriculum development 
processes. Note the similarities. Some are specific concerning valida- 
tion measures; some are not. It should be made explicit here that no one 
model fits all situations. Models are necessarily going to be different 
for secondary programs, for post-secondary programs, for industry-based 
programs, for government agency-based training, for military-based training, 
etc. Hence, it behooves the curriculum designer to become knowledgeable 
in principles of systems development if he is to achieve assurances that 
his curriculum will be valid. One of these models will be used later in 
this paper in an attempt to show where various validation measures will 
be taken, and to show how scores taken at one point in the system are 
correlated with those taken at another. 

There are other "stage-setting" assumptions which need to be made at 
this point in the paper. A second major assumption is that the 
curriculum designer has selectively applied a number of principles of 
learning, because different kinds of learning require different sets
of conditions. The important factors which influence learning are:
motivation, organization, participation, confirmation, repetition, and
application. One type of learning may require emphasis on one factor, 
whereas another type of learning may require two or more factors in 
concert. 

A third assumption is that the designer intends to build the
curriculum so that the sequence of learning progresses from the simple 
to the complex. The sequence or hierarchy should resemble the following: 
specific responses and associations which are prerequisite to verbal and 
motor chains which are prerequisite to discriminations which are pre- 
requisite to concepts which are prerequisite to principles which are 
prerequisite to higher-order principles or strategies for problem-solving. 

Fourth, it is assumed that enabling or interim objectives for lessons, 
modules, or units have been appropriately derived, and are of a degree
of specificity such that the materials can be validated accordingly.
Likewise, it is assumed that terminal (course or training program)
objectives also have been appropriately stated in such a manner that validation
measures also contribute to overall or summative evaluation and 
accountability in programs and projects. One striking advantage of 
precisely stated objectives is that when one is completely clear about 
the nature of the terminal behavior, it is possible to arrange for 
appropriate practice opportunities during the instructional sequence. 

A fifth assumption is that we can at least tentatively agree that 
the essence of education focuses on preparing persons so they might be 
enabled to attack all their problems by bringing knowledge and action 
to bear on them in a unified and integrated rather than a fragmented
manner. Does the student who finishes your curriculum merely possess
a large bewildering body of unrelated facts? Or, can he articulate
knowledge and skills learned so that he can perform? Validation through
the use of criterion-referenced tests assures performance.

Such all-encompassing assumptions may be misleading. They may 
cause some people to feel that further discussion of validation is
unnecessary since so much has been "assumed." This is not the case
because these assumptions touch only a small portion of the elements 
of the entire instructional system. 

Validation involves measurement. Appropriate validation measures 
do not allow wide fluctuation in attainment of objectives, nor do they 
bring about perfect stability. Validation does aid in better control
of achievement of objectives. When combined with appropriate definitions
of behavior changes sought, validation provides the curriculum designer
with a thermostat to assure the achievement of the instructional
objectives. Hence, the curriculum designer "controls" the growth or
behavior change of the student. The designer usually starts with a
comprehensive description of the desired dependent set of events, i.e.,
a finished product or process derived by job/task analysis. Then he
works backward through his analysis to the set of independent events 
most likely to produce the product or process. 

Validation procedures have value in many of the stages of the system. 
However, their greatest value probably occurs when employed in the design 
stages of materials development, in which they are applied to both 
interim as well as terminal objectives. Hence, validation becomes the 
prime focus in checking out the objectives and the criterion tests,
as well as later in pilot tests and field trial assessments of
materials. However, after validation is completed during
design-development and the training is installed in the classroom, testing
for attainment of enabling or interim objectives may become a matter
of self-testing by the student. Formal testing for attainment of
terminal objectives is conducted by the instructor, and these scores
might be used to further validate the curriculum.

In the design stage the curriculum developer may wish to use a 
simple cycle such as: design — test — revise — retest. The use of 
this cycle tends to upgrade the effectiveness of the instructional 
materials through repeated revision. Most of the models shown in this 
paper contain some type of design — test — revise — retest dimension 
or component. 

VALIDATION IN INSTRUCTIONAL SYSTEMS

Validation in the design of curriculum materials demands a systems 
approach which will ensure that testing is conducted at the right 
steps in the overall process. Some curriculum experts have designated
these efforts with a generic term: "developmental validation." 

For purposes of ordering our thinking about validating curriculums, 
we need to utilize a systems model which sequences the events that are
necessary to produce a valid curriculum. The model recently reported
by F. Coit Butler will serve this purpose.

Butler's system is briefly given as follows: 

1. Conduct feasibility study. This requires an analysis
of trends with regard to job markets and occupational patterns;
trends in economic, business, agricultural, and industrial
expansion; types of jobs and worker competencies needed; availability
of training programs and facilities, and their costs; etc.

2. Conduct task analysis. After the decision has been made
that a specific training program or course is needed, an occupational
or job analysis is conducted to determine skills and knowledges
required; kinds and levels of performances demanded by the job, etc.

3. Develop training objectives. At this point the designer
must derive explicit statements about what a student, upon completion
of the training program, must be able to do; the conditions and
standards of his performance; etc. Both terminal (unit, course,
program) objectives and interim or enabling (lesson, activity,
module) objectives must be specified. These may be directly 
coupled to broad goal statements and possibly even broader educa- 
tional or philosophical constructs. 

4. Develop criterion tests. These are used in the early
stages of design to determine validity of the objectives, and
later to help perform summative evaluations of the entire course 
or training program. Additional information about criterion 
tests is included later in this paper. 

5. Validate the criterion test. This is done by administering
the test to an untrained-unskilled group and to a trained-skilled
group and correlating the scores to obtain validity and
reliability coefficients. Test item analysis at this point
calls for interpretations similar to the following: (a) if, for
a given test item, the majority of untrained group responses
are correct, the item has little or no validity or reliability;
and conversely, (b) if, for a given test item, the majority of
trained group responses are incorrect, the item likewise has little
or no validity or reliability. The measurement and statistics
texts listed in the References section of this paper contain
procedures for the test item analysis. This analysis is conducted
to improve the test as a measuring instrument. This is the
most important validation step in the process.
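
The two item-level interpretations in Step 5 can be sketched in code. The
following is a minimal illustration only, not Butler's procedure: the
function name, the 0/1 response layout, and the simple-majority cutoff of
0.5 are assumptions made for the example.

```python
def flag_items(untrained, trained, majority=0.5):
    """Flag criterion test items using the two Step 5 rules of thumb.

    untrained, trained: lists of per-person response lists, where each
    response is 1 (correct) or 0 (incorrect), one entry per test item.
    majority: assumed cutoff for "the majority" (hypothetical default).
    """
    n_items = len(trained[0])
    flags = {}
    for i in range(n_items):
        # Share of each group answering item i correctly.
        p_untrained = sum(person[i] for person in untrained) / len(untrained)
        p_trained = sum(person[i] for person in trained) / len(trained)
        if p_untrained > majority:
            flags[i] = "suspect: most untrained respondents answer correctly"
        elif (1 - p_trained) > majority:
            flags[i] = "suspect: most trained respondents answer incorrectly"
        else:
            flags[i] = "ok"
    return flags
```

An item flagged as suspect is a candidate for revision or removal; the
sketch does not compute the validity or reliability coefficients
themselves.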

6. Validate training objectives. The test should contain
at least one item for each objective, and possibly not more than
five items for each objective; otherwise the test becomes too
long for practical purposes. Validating the test in Step 5
above and validating training objectives can be accomplished
concurrently, provided the test item itself is not at fault.
Interpretations similar to those made in Step 5 are employed
in this step; e.g., (1) if, for a given test item and its
companion objective, the majority of untrained group responses
are correct, there may be no need to include that objective
in the curriculum; and, (2) if, for a given test item and its
companion objective, the majority of trained group responses
are incorrect, there may be no need to include that objective
in the course because, apparently, the worker can perform
on the job without that knowledge or skill.

(These types of interpretations may need to be reviewed in light
of some estimates concerning the possibilities and probabilities
that a worker may be required to "transfer" skills and knowledges
to a different work situation. However, if this becomes probable,
then the situation may warrant a new re-training or up-grading
instructional program, which should itself undergo the same
validation procedures outlined here.)
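
The item-count guideline in Step 6 (at least one item per objective, and
possibly not more than five) lends itself to a simple check. The sketch
below is illustrative only; the function name and the dictionary layout
mapping each objective to its test items are assumptions.

```python
def check_coverage(items_by_objective, min_items=1, max_items=5):
    """Return objectives whose item counts fall outside the suggested range.

    items_by_objective: dict mapping an objective label to the list of
    test item identifiers written for it (hypothetical layout).
    """
    problems = {}
    for objective, items in items_by_objective.items():
        count = len(items)
        if count < min_items:
            problems[objective] = f"{count} items: objective is not tested"
        elif count > max_items:
            problems[objective] = f"{count} items: test may become too long"
    return problems
```

An empty result means every objective has between one and five items.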

According to Butler's model, the initial design phase has been
completed at this point, but the remaining phases also require
validation considerations.

7. Develop learning sequence and structure. This is
done according to the duties, tasks, and activities provided in
the job/task analysis. The following sketch shows a pyramidal
form of learning structure and sequence.

    Job
    Duties        1                       2
    Tasks         1.1     1.2     1.3     2.1     2.2
    Activities    1.1.1 ...  1.2.1 ...  1.3.1 ...  2.1.1 ...  2.2.1 ...  etc.

Activities, tasks, and duties are structured (and learned) in both 
a vertical and horizontal sequence. The learning of one is
dependent upon accomplishment of those which precede it. Most
curriculum experts recognize that sequencing must be approached
with a great deal of flexibility. The general guideline of
efficiency should influence sequencing.

Butler sets forth a matrix analysis technique for preparing 
the course outline in which supporting knowledges and skills for
activities, tasks, and duties are listed. He indicates that the
learning sequence can be plotted by starting with the terminal
objective and working backward through each preceding prerequisite,
in essence, from the complex back to the simple. Butler suggests
listing all terms, concepts, rules, and principles which pertain
to each objective. List them as single-fact statements and assign
each a number. Each number is then placed in a two-dimensional
matrix (discrimination-association) along a diagonal line from top
left to bottom right. Associations then are marked in the common
squares above the diagonal, and discriminations are marked in the
common squares below the diagonal. By shuffling and reshuffling,
a rearranged matrix can be plotted which depicts an optimum
clustering of discriminations and associations around the diagonal,
which results in the best sequencing. The clusters tend to
depict broad concepts in the curriculum.
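
The shuffling described above can be imitated for a handful of facts.
This is a rough sketch under stated assumptions, not Butler's stated
criterion: "clustering around the diagonal" is approximated here by the
total distance of marked squares from the diagonal, associations and
discriminations are treated alike, and the names spread and best_order
are hypothetical.

```python
from itertools import permutations

def spread(order, links):
    """Total distance of marked squares from the diagonal for an ordering.

    order: sequence of fact numbers in their matrix positions.
    links: pairs of fact numbers that share a marked square.
    """
    position = {fact: idx for idx, fact in enumerate(order)}
    return sum(abs(position[a] - position[b]) for a, b in links)

def best_order(facts, links):
    """Try every ordering and keep the one with the smallest spread.

    Only practical for a few facts, since the number of orderings
    grows factorially.
    """
    return min(permutations(facts), key=lambda order: spread(order, links))
```

For facts 1 through 4 with marked pairs (1, 3), (3, 2), and (2, 4), the
original order has a spread of 5, while reordering the facts as 1, 3, 2, 4
brings every marked square next to the diagonal.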

Validating the sequence also is accomplished with the criterion
test which has been validated and revised. The test is given to
a group (30 to 50) of trained individuals, i.e., as a post-test
to persons who just completed the program, or to those who have
been on the job about six months. In the analysis of these scores,
one looks for the dependency and interdependency between and among
units, lessons, or fairly large blocks of curriculum content. 

Butler indicates that the test data should be analyzed
with two basic questions in mind: (1) Did the majority of those
students who correctly performed a subordinate unit (Unit No. 1)
also correctly perform the following and supposedly dependent
unit (Unit No. 2)? And, (2) Did the majority of those who
correctly performed the higher unit (Unit No. 2) also perform
the subordinate unit (Unit No. 1) correctly? If, for a tested
trained sample, the answers to both questions are affirmative,
then the sequence is valid. If, for only 85% of the sample,
the answers are affirmative, then the sequence is probably valid.
See the "Validating Content Sequence" chart below
for implications when 50% or less of the trained sample perform
incorrectly.
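
The two questions can be applied to a pair of units as follows. This is a
minimal sketch, assuming each trainee's result is recorded as a pair of
booleans; the function name is hypothetical, the 0.85 and 0.50 cutoffs are
taken from the figures cited in the text, and the "inconclusive" label for
in-between results is an assumption of the sketch.

```python
def sequence_check(results, probable=0.85, doubtful=0.50):
    """Apply the two sequence questions to one pair of units.

    results: list of (performed_sub_unit, performed_unit) booleans,
    one pair per trained individual.
    """
    did_sub = [unit for sub, unit in results if sub]
    did_unit = [sub for sub, unit in results if unit]
    if not did_sub or not did_unit:
        raise ValueError("need at least one correct performer of each unit")
    # Question 1: share of sub-unit performers who also performed the unit.
    q1 = sum(did_sub) / len(did_sub)
    # Question 2: share of unit performers who also performed the sub-unit.
    q2 = sum(did_unit) / len(did_unit)
    if q1 >= probable and q2 >= probable:
        verdict = "sequence probably valid"
    elif q1 <= doubtful and q2 <= doubtful:
        verdict = "sequence probably incorrect"
    else:
        verdict = "inconclusive: review the test items"
    return q1, q2, verdict
```

A mixed result, where one question is answered affirmatively and the
other is not, points to the test item rather than the sequence.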

The foregoing procedure is used only on a pair of tasks in 
a hierarchy. Suppose the hierarchy consisted of three or more 
tasks and validation is still required. Recent research has gone 
in the direction of trying to discover such hierarchies and their
properties, and validation procedures are under study, using
factor analysis techniques. The curriculum designer may wish to
refer to "A Method for Validating Sequential Instructional
Hierarchies," by P. W. Airasian, in the December 1971 issue of
Educational Technology Journal. This method is based on
calculation of conditional item difficulty indices and facilitates the
pinpointing of sequential levels within a hierarchy which require 
revision. 

8. Develop learning strategies. This step has no
feasible validation procedures which are not too costly and time
consuming to use. Media are selected according to those that will
do an effective job for the least cost. Combinations of the 
different media usually should be considered. 

Validation is influenced by the media. Test scores may be
low for students with reading problems, but the same test scores
may be improved by using audio media instead of printed media.
The objectives and student learning styles are the prime
determinants in developing the learning strategies.

Validating Content Sequence

Summary of procedure for analyzing criterion test data from a sample
trained population when validating content structure and sequence

Trained Sample                Performance              Implications
(only correct performance)
--------------------------------------------------------------------------------
Performs unit (100%)          85% perform sub-unit     Possible correct sequence
Performs sub-unit (100%)      85% perform unit         Possible correct sequence
    Taken together, a certainty the sequence is correct.

Performs unit (100%)          85% perform sub-unit     Possible correct sequence
Performs sub-unit (100%)      50% fail to perform      Possible incorrect sequence
                              unit
    Taken together, indicates bad test item.

Performs unit (100%)          50% fail to perform      Possible incorrect sequence
                              sub-unit
Performs sub-unit (100%)      85% perform unit         Possible correct sequence
    Taken together, indicates bad test item.

Performs unit (100%)          50% fail to perform      Possible incorrect sequence
                              sub-unit
Performs sub-unit (100%)      50% fail to perform      Possible incorrect sequence
                              unit
    Taken together, a certainty the sequence is incorrect.

Source: F. Coit Butler. Instructional Systems Development for
Vocational and Technical Training. Educational Technology
Publications, Englewood Cliffs, New Jersey, 1972.

9. Develop instructional units (lessons). This is the
point where a test model of the instructional system is produced.
Two documents are needed: (1) the system development plan, and 
(2) the instructor's manual or guide. 

The system development plan contains: (1) task analysis 
summary forms; (2) validated objectives in validated sequence, 
supported by a summary of the validation data; (3) validated 
criterion test items in validated sequence, supported by a 
summary of the validation data; (4) outline of instructional 
strategies with associated content (objectives) identified; and 
(5) production and testing plans for the system. 

The design and format of the individual learning units may 
vary greatly, but each should contain the following: (1) the 
performance objectives; (2) the knowledges and skills to be 
gained; (3) a list of tools, equipment, supplies, references, 
etc., needed for the unit; (4) a learning activity guide;
(5) interim progress checks and student self-evaluations; and
(6) an instrument to serve as a pre-test and/or a post-test for
evaluations by the instructor.

10. Validate learning units. At this point each unit
is tested and revised until 85% of sample trainees reach the
criterion. Revision may require resequencing and adoption of new
learning strategies. Initial testing is done on an individual or
one-to-one basis, with two or three sample trainees who have upper-level
ability. Minor revisions may be made at this point; however, if
major revision is indicated, two or three more individual tryouts
should be conducted.

Small-group tryout is then conducted on 6 to 10 students 
who represent the range of ability and background of the target 
population. Criterion test data are again used to locate trouble 
spots and revision is made. At this point, 85% of the students 
should be performing correctly on the criterion test. 

Final tryout is made on a large group of 30 to 50 students 
under conditions which approximate actual training. This tryout 
is conducted by the curriculum designer along with the instructor(s).
A group this size is needed to verify (or validate) previous 
design results. Final revision is made following this tryout. 
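
The 85% standard applied during tryouts reduces to a single proportion.
The sketch below is illustrative; the function name and the idea of a
numeric passing score on the criterion test are assumptions.

```python
def unit_meets_criterion(scores, passing_score, required_share=0.85):
    """Return True when enough sample trainees reach the criterion.

    scores: criterion test scores for the tryout group.
    passing_score: assumed numeric standard for "reaching the criterion".
    required_share: share of trainees who must reach it (85% in the text).
    """
    reached = sum(1 for score in scores if score >= passing_score)
    return reached / len(scores) >= required_share
```

When the function returns False, the unit goes back for revision and
another tryout.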

11. Implement and field test the system. This is done
under actual classroom conditions. The instructor's role in the
instructional system is explicated at this point, and the
Instructor's Manual is developed. His role becomes that of
manager and facilitator of learning. His tasks are as follows:
(1) diagnose individual learning needs; (2) prescribe learning
experiences; (3) provide proper materials and equipment at the right
time; (4) test and evaluate individual progress; (5) compile
individual and group progress records; (6) provide tutorial and
counseling help; (7) provide motivational reinforcement;
(8) provide supplementary materials and experiences; (9) coordinate
individual, small-group, and large-group learning activities;
(10) coordinate use of learning materials and equipment; and
(11) evaluate feedback data on effectiveness of learning.

The Instructor's Manual should contain: (1) course description;
(2) student population description; (3) performance objectives;
(4) criterion tests; (5) system performance data; and
(6) suggestions for administering the system.

Field testing is the final phase of the systems development 
process. This means the program is monitored, evaluated, and 
subsequently revised continuously for as long as it is in use. 
This phase may be more appropriately referred to as system 
"institutionalization." Constant monitoring and analysis of 
criterion test data will continue to point the way for needed 
revision. 

Butler points out that a training system is never "finished," 
rather, it is constantly "in process." 

12. Follow-up graduates. At this point, effective
guidance and placement are brought into play. Longitudinal planning
for follow-up at 1-year, 3-year, 5-year, or 10-year intervals
may be started at this point. Follow-up to obtain details of
occupational patterns, changes in needed competencies, job
adjustment problems, and work satisfaction indices can all be
used to improve the instructional system.

CRITERION-REFERENCED TESTS

Much has been said in the foregoing material about criterion tests,
or criterion-referenced tests (CRT). The CRT is central to all
validation efforts.

In curriculum development we are concerned with criterion-referenced 
tests, whereas, in traditional test development, the concern has been 
and still is with norm-referenced tests (NRT). A simple distinction
should be made here between the CRT and the NRT. 

The NRT is the more traditional type of test and is used to 
identify an individual's performance in relation to the performance 
of others on the same measure. Hence, the NRT is viewed as a relative 
measure. The CRT, on the other hand, is used to identify an individual's 
status with respect to an established standard, or criterion, of per- 
formance. The CRT, therefore, is viewed as an absolute measure. 
Curriculum developers are concerned with getting an individual person 
trained proficiently according to a predetermined set of absolute 
criteria, rather than training him relative to the performance of 
other individuals. 
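
The distinction can be made concrete with one score read two ways. This
is a minimal sketch; both function names and the choice of "share of the
group scoring lower" as the relative measure are assumptions made for
illustration.

```python
def norm_referenced(score, group_scores):
    """Relative reading: share of the group scoring below this score."""
    below = sum(1 for other in group_scores if other < score)
    return below / len(group_scores)

def criterion_referenced(score, criterion):
    """Absolute reading: has the individual met the fixed standard?"""
    return score >= criterion
```

A score of 70 in a group scoring 50, 60, 70, and 80 stands above half the
group, yet it still falls short of a criterion set at 75.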

CRTs can be devised for use in making decisions both about
individuals and about instructional programs. Concerning decisions about
individuals, one might use a CRT to determine whether a student had
mastered a criterion skill that is prerequisite to starting a new
training program, or a new sequence within a program. Concerning
instructional programs, a CRT could be designed that would reflect
attainment of objectives based on a hierarchical sequence. The CRT
could be administered to learners after they completed the sequence,
and, after analysis, the efficacy of the sequence might be determined.

The CRT and Reliability. Validity of the curriculum generally,
and of the CRT specifically, cannot be considered apart from reliability.
This implies that the CRT must be internally consistent, i.e., CRT
items must be similar to stated behavioral objectives in terms of
what they are measuring. Although traditional statistical procedures
for determining reliability coefficients are not necessarily appropriate
for CRTs, it is thought at this time that coefficients which are derived
by considering both a pre-instruction assessment and a post-instruction
assessment as part of the same extended phenomenon might yield more
meaningful reliability estimates.

An ideal curriculum component, package, program, unit, lesson, etc.,
should result in perfect learning on the part of all individuals.
While individuals may differ in the amount of time required to go
through a curriculum component, once they have completed it, all
should have mastered the content. From this point of view a good
program should result in little variability for a measure of learning.
One might suppose, then, that variability of G scores (gain between
pre-test and post-test) would be a criterion that could be used in
assessing programs such that the less the variability, the better
the curriculum component. (It should be recalled by the reader that
correlation coefficients derived by traditional statistical methods
rely on variability.)
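
A minimal sketch of the G-score comparison follows. The two
components and all scores are invented for illustration, and the
function name is an assumption. Because a gain score also carries
whatever spread existed on the pre-test, the sketch reports post-test
variance alongside the variance of G.

```python
from statistics import mean, pvariance


def gain_scores(pre, post):
    """G score for each learner: post-test score minus pre-test score."""
    return [after - before for before, after in zip(pre, post)]


# Hypothetical pre-test scores, shared by both components for comparison.
pre = [30, 35, 40, 35]

# Hypothetical post-test scores for two curriculum components.
post_a = [95, 100, 100, 95]  # nearly everyone reaches mastery
post_b = [50, 100, 70, 95]   # learning is uneven

for name, post in (("A", post_a), ("B", post_b)):
    g = gain_scores(pre, post)
    print(name, "mean G:", mean(g),
          "variance of G:", pvariance(g),
          "post-test variance:", pvariance(post))
```

Under the reasoning above, the component whose G scores vary less
would be judged the better of the two.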

The above is merely mentioned here in the event the curriculum
designer has the time and inclination to work toward empirical
reliability estimates.

In the initial design stages, the designer takes the objectives
and recasts them as items on the CRT. If the objectives have been
derived from accurate job analyses, then they should have job validity,
and consequently, test items geared to these objectives also should
have true validity. Reliability of the CRT will depend upon
job-objective-test behavior that is observable and measurable. To improve
test reliability, a preliminary check can be made on two different
groups of 30 to 50 persons: (1) untrained-unskilled persons, who might
be entering students, and (2) trained-skilled persons who might have
been on the job for less than six months. (This procedure was
outlined in one of the steps in Butler's model.) Comparisons of
scores of the two groups will yield a correlation coefficient of
reliability. The reliability check may require major revision of the
entire test, but each item should be treated separately, since a
composite test reliability coefficient will not pinpoint the specific
items that need revision, whereas an item-by-item analysis will.
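
The item-by-item check might be sketched as follows. This is not the
correlation coefficient mentioned above; a simpler index, the
difference in proportion correct between the two groups, is swapped
in for illustration. The response data, the group size of four
(rather than 30 to 50), the 0.2 threshold, and all names are
assumptions.

```python
def proportion_correct(responses, item):
    """Share of persons answering one item correctly (1 = correct, 0 = not)."""
    return sum(person[item] for person in responses) / len(responses)


def item_differences(untrained, trained):
    """Trained-group minus untrained-group proportion correct, per item."""
    n_items = len(trained[0])
    return [
        proportion_correct(trained, i) - proportion_correct(untrained, i)
        for i in range(n_items)
    ]


# Hypothetical responses: one row per person, one column per item.
untrained = [[0, 1, 0], [0, 1, 1], [1, 1, 0], [0, 1, 0]]
trained = [[1, 1, 0], [1, 1, 0], [1, 1, 1], [1, 1, 0]]

THRESHOLD = 0.2  # assumed minimum difference for an item to be kept as is

diffs = item_differences(untrained, trained)
# Report 1-based item numbers for items that fail to separate the groups.
flagged = [i + 1 for i, d in enumerate(diffs) if d < THRESHOLD]
print(diffs)
print(flagged)
```

An item that both groups pass, or that both groups fail, shows a
difference near zero and is flagged for individual review, which a
single composite coefficient would not do.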

In the case where a curriculum is being developed for a new or
emerging job or career, non-availability of trained-skilled persons
for purposes of determining an in-design system reliability estimate
would prevent the use of the above approach. On the other hand, for
those ongoing curriculums that are being subjected to continuous
revision and study, the above approach to determining reliability
would seem to be a tenable one. This technique is suggested for
consideration despite the fact that it may be time consuming and
somewhat costly. At the risk of sounding trite, the curriculum
designer is reminded that funds and time expended early in the
blueprint or design stages may result in larger savings later on as
training takes place.

The CRT and Validity. Procedures used in assessing the validity
of the CRT suffer difficulties similar to those found in assessing
reliability. Validity of NRTs is based on correlations, hence on
variability, the search being usually for coefficients that exceed +.60.
However, CRT coefficients of less than +.60, and even negative
coefficients, are not necessarily devastating. CRT items are validated
primarily in terms of the adequacy with which they represent the
criterion stated in the objective. Adequacy of content is especially
important for tests that measure outcomes of education or training.
Hence, content validity approaches may have some application to CRT
validation. In content validity we determine the skills, knowledge,
and understanding that comprise the correct behavior we are seeking in
students, then translate these to objectives and construct a test or
tests to measure attainment or achievement. Finally, we match the
analysis of test content against the analysis of instructional program
content and objectives and see how well the former represents the
latter. To the extent that our objectives are represented in the test,
the test is valid.
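
The matching step can be sketched as a simple tally. The objective
labels, the item-to-objective assignments, and the function name are
hypothetical; in practice the assignments would come from a
reviewer's judgment rather than from code.

```python
def content_coverage(objectives, item_objectives):
    """Compare test content with the objectives it is meant to sample.

    objectives: list of objective labels.
    item_objectives: dict mapping item number -> objective label, or None
    when the item could be matched to no stated objective.
    """
    stated = set(objectives)
    tested = {obj for obj in item_objectives.values() if obj in stated}
    unrepresented = sorted(stated - tested)
    unmatched_items = sorted(
        item for item, obj in item_objectives.items() if obj not in stated
    )
    covered = len(tested) / len(stated)
    return covered, unrepresented, unmatched_items


# Hypothetical objectives and reviewer assignments (illustrative only).
objectives = ["OBJ-1", "OBJ-2", "OBJ-3", "OBJ-4"]
item_objectives = {1: "OBJ-1", 2: "OBJ-1", 3: "OBJ-2", 4: "OBJ-3", 5: None}

print(content_coverage(objectives, item_objectives))
```

An objective with no item, or an item tied to no objective, marks a
point where test content and program content do not match.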

The major focus of validating the CRT is to show that its items
are a representative sample of all aspects and facets of the behavior
prescribed in the objective. This means there may be pencil-paper
items pertaining to skills and knowledges. It also means that there
may be items which measure performance. Responses to pencil-paper
tests are easier to obtain than responses to tests of performance.
Performance tests usually call for responses that require all the
dimensions of behavior, such as speed, accuracy, judgment, etc.

The CRT and Item Analysis. Item analysis procedures have
traditionally focused on pinpointing test items on NRTs that do not
discriminate among persons who take the test. Faulty items would be
those which are too easy, too hard, or ambiguous. Both positively
and negatively discriminating items for CRTs may pinpoint areas of
instruction which need revision. Negatively discriminating items,
however, are the ones which should be identified; identifying them
will depend on the ease with which analyses can be conducted. This
usually demands sophisticated data-processing techniques.
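
One way to sort CRT items is sketched below. It assumes that the
proportion of learners answering each item correctly is available
both before and after instruction. That framing, the cutoffs of 0.8
and 0.5, the labels, and the data are all assumptions made for
illustration; this is not the Webster and McLeod technique.

```python
def classify_item(p_pre, p_post, easy=0.8, hard=0.5):
    """Label one item from its pre- and post-instruction proportion correct."""
    if p_post < p_pre:
        return "negatively discriminating"  # fewer correct after instruction
    if p_pre >= easy:
        return "too easy"  # largely mastered before instruction began
    if p_post <= hard:
        return "too hard"  # still not mastered after instruction
    return "acceptable"


# Hypothetical (pre, post) proportions correct for four items.
items = {1: (0.6, 0.4), 2: (0.9, 0.95), 3: (0.1, 0.3), 4: (0.2, 0.9)}

for number, (p_pre, p_post) in items.items():
    print(number, classify_item(p_pre, p_post))
```

Items labeled negatively discriminating would be examined first,
since they suggest that either the item or the related instruction
is working against the objective.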

Webster and McLeod present an excellent technique for item 
analysis of a module test which can be used to perform item analyses 
on CRTs. 

SUMMARY 

The foregoing material has attempted to present a rationale for
validating curriculums in vocational and technical education. A
systems approach was used to give the discussion of validation an
orderly structure.

The curriculum designer may have concluded that validation efforts
are extraordinarily time consuming and require test and measurement
expertise not ordinarily found among curriculum staff members.
Nevertheless, validation procedures as outlined in this paper proceed
in an orderly fashion, building on each preceding step. The result is a
curriculum package which can be identified as being sound and
productive of persons who can perform.